-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

URL (Uniform Resource Locator)

The primary naming scheme used to identify Web resources. URLs define the protocols to be used, the domain name of the Web server where a resource resides, the port address to be used for communication, and a fully-qualified directory pathname on the server where the named Web file or resource can be found.

URLs

Guide to URLs: David Baker's comprehensive "Guide to URLs" which explains all of the Uniform Resource terminology you'll ever care to know.
URLs: Yahoo's list of information on this topic contains nearly everything you need to know.
RFC 1630: The RFC on Universal Resource Identifiers in WWW, which covers URL syntax.
URLs: The Virtual Library is Alan Richmond's outstanding compendium of Web- and Internet-related resources; the coverage of URLs and naming is as good as it is for most other Web topics you'll find there.
URLs: A good general overview of URLs and related concepts, by Bruce Gingery, written in the most approachable prose of any of the references mentioned here (see quote below).

W3E References

Address
URI
URL-encoding
URN

Detail

Web resources are identified by special names called Uniform Resource Identifiers (URIs). Whenever you connect to a Web page, you do so through its Uniform Resource Locator (URL), which names the protocol required to access it and points at its Internet location and home directory. According to Bruce Gingery, URLs are:

Those funny little strings with a hostname stuck in after a colon, the cryptic http: or other short identifier on the front, and a UNIX style path are called Uniform Resource Locators. They are the addresses of the Internet today--with nearly all services addressable in this way. Generally URLs are a line of words, punctuated by slashes, colons, sometimes question-marks or ampersands, and percent signs.

URLs hold the keys to the Web

As you examine a URL, it should look like this:

http://info.cern.ch:nnnn/hypertext/WWW/Addressing/Addressing.html#anchor
|--1--||----2-----||-3-||-----------4------------||------5------||--6--|

A URL is composed of six parts, as numbered on the line beneath the sample URL above. These work as follows:

1. protocol/data source - For network resources, this is usually the name of the protocol used to access the data that resides on the other end of the link. The common forms of this part of the name are:
- "ftp://" This points to a file accessible through the File Transfer Protocol.
- "gopher://" This points to a file system index accessible through the gopher protocol.
- "http://" This points to a hypertext document (typically, an HTML file) accessible through the hypertext transfer protocol.
- "mailto:" This links to an application that lets you compose a message to be sent to a predefined address, using electronic mail. Unlike most of the other forms, it takes no slashes after the colon.
- "news:newsgroup-name" This points to a USENET newsgroup, and uses the network news transfer protocol (NNTP) to access the information.
- "telnet://" This links to a remote login on another Internet computer, typically to select from a predefined menu of choices or options.
- "wais://" This points to a Wide-Area Information Server (WAIS) on the Internet, and provides access to a system of indexed databases.
- For local data (typically, reading HTML files from one of your desktop machine's hard disks or other drives), the syntax varies from browser to browser, but usually starts with:
  - "file://" which indicates that it's a local file, rather than a Web page elsewhere on the Internet.
2. domain name - This is the domain name for the Web server where the desired Web page or other resource resides (see the entry on domain names for more information on this subject).
3. port address - TCP/IP applications like HTTPd (the server software that handles incoming requests for Web services) use a numbered address, called a port address (sometimes called a "socket ID"), to distinguish among the many TCP/IP-based services that a server can offer. This port address makes sure that requests for services go to the right program, and are appropriately handled. The default port address for HTTP is 80, but if you see a port address specified in a URL, it's a good idea to include it when accessing the Web server to which it corresponds (even if it's the default value).
4. directory path - This is where the desired Web page or resource lives in the Web server's file system. You must always use forward slashes ("/") between directory levels when specifying a URL, no matter how the server itself may actually understand directory specifications (HTTP and the Web server software handle the translation).
5. object name - This is the name of the HTML file for the desired Web page or the name of whatever other resource is required. For some applications (FTP, for example), the URL ends with the directory where the files reside, and a listing of the directory's contents is displayed as a "pick list" on a Web page, from which a file can be selected. Sometimes this is hidden from users, and only a particular file can be obtained by selecting a link on a Web page (which then uses FTP in the background to transfer the requested file).
6. anchor - Within an HTML document, hyperlinks can point to specific locations on a page (not just the top of the page). Anchors mark the locations that browsers seek to display at the top of the screen, or centered on the screen, depending on how the browser is built. Warning: For an anchor near the bottom of a document, the browser simply displays the last full screen of information, so the anchor may not appear at the top of the screen.
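To make the six parts concrete, here is a minimal Python sketch that splits the sample URL with the standard library's urllib.parse.urlsplit. The function name six_parts, the DEFAULT_PORTS table, and the port value 8080 (standing in for "nnnn") are our own assumptions for illustration; only the HTTP default of 80 comes from the text above.

```python
from urllib.parse import urlsplit

# Only the default named in the text; other protocols are left out on purpose.
DEFAULT_PORTS = {"http": 80}


def six_parts(url):
    """Split a URL into the six parts described in the list above."""
    parts = urlsplit(url)
    # Everything up to the last slash is the directory path;
    # whatever follows it is the object name.
    directory, _, object_name = parts.path.rpartition("/")
    port = parts.port
    if port is None:
        # No port in the URL: fall back to the protocol's default, if known.
        port = DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "domain name": parts.hostname,
        "port address": port,
        "directory path": directory + "/",
        "object name": object_name,
        "anchor": parts.fragment,
    }


if __name__ == "__main__":
    sample = ("http://info.cern.ch:8080/hypertext/WWW/"
              "Addressing/Addressing.html#anchor")
    for name, value in six_parts(sample).items():
        print(name, "=", value)
```

Leaving the port out of the URL makes the sketch report 80 for "http", which matches the default described under the port address entry.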

All in all, the most important thing about URLs is to enter them exactly as they're written, to prevent errors. When building Web pages, this means it's absolutely critical to test each and every embedded URL reference to make sure it's correct. When using a Web browser, this means it's better to cut and paste URLs into a hotlist or a text file than to type them out by hand and risk a transcription error.
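As a starting point for testing embedded URLs, the following Python sketch collects every href and src value from a page and flags the ones whose structure looks wrong (a network protocol with no server name, as happens when a slash is dropped). The names LinkCollector and suspicious_links, and the list of protocols checked, are assumptions made for this example. It checks structure only; actually following each link is still the real test.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Gather the values of href and src attributes from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def suspicious_links(html_text):
    """Return (all links, links that name a network protocol but no server)."""
    collector = LinkCollector()
    collector.feed(html_text)
    bad = []
    for link in collector.links:
        parts = urlsplit(link)
        # "http:/host/page" parses with an empty server name: a likely typo.
        if parts.scheme in ("http", "ftp", "gopher") and not parts.netloc:
            bad.append(link)
    return collector.links, bad


if __name__ == "__main__":
    page = (
        '<a href="http://info.cern.ch/hypertext/WWW/Addressing/Addressing.html">ok</a>'
        '<a href="http:/info.cern.ch/typo.html">typo</a>'
    )
    all_links, bad_links = suspicious_links(page)
    print(len(all_links), "links found;", len(bad_links), "look wrong")
```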

For more information on URLs, consult Addressing. This URL describes the details of URL syntax and supported protocols, and points to specifications and other documents on this subject.

URL Syntax and Punctuation

Strictly speaking, URL syntax allows a colon (":") in only two places: between the protocol or data source identifier and the rest of the name (where it is usually followed by two forward slash characters), and between the domain name for the server and its port address (see below for more information).
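The two colon positions can be sketched as a regular expression in Python. This is an illustration of the rule as stated here, not a full URL validator: the pattern name PREFIX and the function split_prefix are hypothetical, and the sketch only examines the protocol, domain name, and port address, leaving the remainder of the URL unchecked.

```python
import re

PREFIX = re.compile(
    r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*)://"  # first colon, then two slashes
    r"(?P<host>[^:/]+)"                         # domain name: no colon, no slash
    r"(?::(?P<port>[0-9]+))?"                   # optional second colon and port
    r"(?P<rest>/.*)?$"                          # the rest is not examined
)


def split_prefix(url):
    """Return (protocol, domain name, port or None), or None if malformed."""
    match = PREFIX.match(url)
    if match is None:
        return None
    port = match.group("port")
    return (match.group("scheme"), match.group("host"),
            int(port) if port else None)


if __name__ == "__main__":
    print(split_prefix("http://info.cern.ch:80/hypertext/WWW/Addressing/"))
    print(split_prefix("http://info.cern.ch:80:90/"))  # stray colon: rejected
```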

For access to local files, we'd recommend that you first look for a menu selection in your browser that will let you search your local file system for an HTML file to open (look for choices like "Open File" or "Open Local File"). If that doesn't work, we've had pretty good results with the following approach:

file:///drive ID|directory path/filename

Notice the three forward slashes after the colon. After the drive ID (which would be a letter for DOS, or the volume name for Macintosh, NetWare, etc.), use a vertical bar character ("|") in place of a colon. Then when specifying the directory path, use forward slashes to separate directory levels. Follow this with the exact name of the file, and you should be able to access it with your browser.
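The recipe above can be sketched as a small Python helper. The function name legacy_file_url and its three arguments are our own invention for illustration, and the sketch assumes the directory path is written starting from the top of the drive. It swaps backslashes for forward slashes and places the vertical bar after the drive ID, as described.

```python
def legacy_file_url(drive_id, directory_path, filename):
    """Build a file URL in the drive-ID-plus-vertical-bar form described above."""
    # Use forward slashes between directory levels, whatever the local system uses.
    directory = directory_path.replace("\\", "/").strip("/")
    middle = "/" + directory + "/" if directory else "/"
    # Three slashes after the colon, then the drive ID and a vertical bar.
    return "file:///" + drive_id + "|" + middle + filename


if __name__ == "__main__":
    print(legacy_file_url("C", "\\docs\\web", "index.html"))
```

For drive C, the directory \docs\web, and the file index.html, this produces file:///C|/docs/web/index.html.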

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

E-Mail: The World Wide Web Encyclopedia at wwwe@tab.com
E-Mail: Charles River Media at chrivmedia@aol.com
Copyright 1996 Charles River Media. All rights reserved.
Text - Copyright © 1995, 1996 - James Michael Stewart & Ed Tittel.
Web Layout - Copyright © 1995, 1996 - LANWrights & IMPACT Online.
Revised -- February 20th, 1996